A Study of Chinese Lexical Analysis Based on Discriminative Models
Abstract
This paper briefly describes our system in the Fourth SIGHAN Bakeoff. Discriminative models, including the maximum entropy model and conditional random fields, are used for Chinese word segmentation and named entity recognition with different tag sets and features. A transformation-based learning model is used for part-of-speech tagging. Evaluation shows that our system achieves F-scores of 92.64% and 92.73% in the NCC word segmentation closed and open tests, 89.11% in the MSRA named entity recognition open test, and 91.13% and 91.97% in the PKU part-of-speech tagging closed and open tests. All of these results are mid-range performances on the bakeoff tracks.
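The abstract reports F-scores without defining them. As a minimal sketch, assuming the usual word-level definition for segmentation (a predicted word counts as correct only if both of its boundaries match the gold standard), the score could be computed as below. The function names `to_spans` and `segmentation_f_score` and the example sentence are illustrative assumptions, not taken from the paper or from the bakeoff scoring tools.

```python
def to_spans(words):
    """Convert a list of words into a set of (start, end) character spans."""
    spans, start = set(), 0
    for word in words:
        spans.add((start, start + len(word)))
        start += len(word)
    return spans


def segmentation_f_score(gold_words, pred_words):
    """Return (precision, recall, F) for one segmented sentence.

    Assumption: a predicted word is correct only when its exact
    character span also appears in the gold segmentation.
    """
    gold, pred = to_spans(gold_words), to_spans(pred_words)
    correct = len(gold & pred)
    if correct == 0:
        return 0.0, 0.0, 0.0
    precision = correct / len(pred)
    recall = correct / len(gold)
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Hypothetical example: the system wrongly splits the last word in two.
gold = ["我们", "在", "北京"]
pred = ["我们", "在", "北", "京"]
precision, recall, f_score = segmentation_f_score(gold, pred)
print(f"P={precision:.4f} R={recall:.4f} F={f_score:.4f}")
```

Here two of the four predicted words match the three gold words, so precision is 2/4, recall is 2/3, and their harmonic mean is 4/7.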
Similar resources
The effects of task complexity on Chinese learners’ language production: A synthesis and meta-analysis
The present meta-analysis was conducted to provide a quantitative measure of the overall effects of task complexity on Chinese EFL learners’ language production. Based on the strict inclusion criteria, 12 primary studies were synthesized according to key features. Eleven of them were meta-analyzed to investigate effects of raising the resource-directing task comple...
Capturing Paradigmatic and Syntagmatic Lexical Relations: Towards Accurate Chinese Part-of-Speech Tagging
From the perspective of structural linguistics, we explore paradigmatic and syntagmatic lexical relations for Chinese POS tagging, an important and challenging task for Chinese language processing. Paradigmatic lexical relations are explicitly captured by word clustering on large-scale unlabeled data and are used to design new features to enhance a discriminative tagger. Syntagmatic lexical rel...
Learning Chinese language structures with multiple views
Motivated by the inadequacy of single view approaches in many areas in NLP, we study multi-view Chinese language processing, including word segmentation, part-of-speech (POS) tagging, syntactic parsing and semantic role labeling (SRL), in this thesis. We consider three situations of multiple views in statistical NLP: (1) Heterogeneous computational models have been designed for a given problem;...
Towards Accurate and Efficient Chinese Part-of-Speech Tagging
From the perspective of structural linguistics, we explore paradigmatic and syntagmatic lexical relations for Chinese POS tagging, an important and challenging task for Chinese language processing. Paradigmatic lexical relations are explicitly captured by word clustering on large-scale unlabeled data and are used to design new features to enhance a discriminative tagger. Syntagmatic lexical rela...
The Effect of Interaction on Lexical Acquisition
This research showed that appropriate input and suitable contexts for interaction among students can lead to successful second language acquisition (SLA). This study, based on Swain's (2005) notion of collaborative dialogue, aimed to examine whether EFL learners participating in negotiation-of-meaning tasks collaborate with each other and, if so, to investigate the role of this behavior in ...